Abstract. We study numerically the cluster structure of random ensembles of two NP-hard
optimization problems originating in computational complexity, the vertex-cover problem and
the number partitioning problem. We use branch-and-bound type algorithms to obtain exact
solutions of these problems for moderate system sizes. Using two methods, direct
neighborhood-based clustering and hierarchical clustering, we investigate the structure of
the solution space. The main result is that the correspondence between solution structure
and the phase diagrams of the problems is not unique. Namely, for vertex cover we observe a
drastic change of the solution space from large single clusters to multiple nested levels of
clusters. In contrast, for the number-partitioning problem, the phase space always looks very
simple, similar to a random distribution of the lowest-energy configurations. This holds in
the "easy"/solvable phase as well as in the "hard"/unsolvable phase.



1. Introduction 

For NP-hard optimization problems [1] no algorithm is known which solves the problem in the
worst case in a time which grows less than exponentially with the problem size. From a physical
point of view, this suggests that the underlying solution space is probably organized in a
complicated way, e.g. characterized by a hierarchical clustering. Such clustering has already
been observed in statistical physics when studying spin glasses [2, 3, 4, 5]. For the
mean-field Ising spin glass, also called Sherrington-Kirkpatrick (SK) model [6], the
solution exhibits replica-symmetry breaking (RSB) [8], which means that the state space is
organized in an infinitely nested hierarchy of clusters of states, characterized by
ultrametricity [9]. The clustering structure of the SK model has also been observed in
numerical studies, e.g. by calculating the distribution of overlaps [10, 11, 12], when
studying the spectrum of spin-spin correlation matrices [13, 14], or when applying direct
clustering [15]. On the other hand, for models like Ising ferromagnets it is clear that they
do not exhibit RSB and all solutions are organized in one cluster (resp. two, when including
spin-flip symmetry).

The use of notions and analytical tools from statistical mechanics has recently enabled
physicists to contribute to the analysis of these NP-hard problems that originate in
theoretical computer science. Well-known problems of this kind are the satisfiability (SAT)
problem [16, 17, 18], number partitioning [19, 20], graph coloring [21], and vertex cover
[22, 23, 24, 25, 26]. Analytically, the phase diagrams of these problems on suitably defined
parametrized ensembles of random instances can be studied using some well-known techniques
from statistical physics, like the replica trick [27, 28] or the cavity approach [18]. But
full solutions have not been found in most cases, since these problems are usually not defined
on complete graphs but on diluted graphs, which poses additional technical problems. Usually,
one can only calculate the solution in the case of replica symmetry [27, 28, 18], or in the
case of one-step replica symmetry breaking (1-RSB) [17, 18], and look for the stability of the
solutions. For this reason, the relation between the solution and the clustering structure is
not well established, and for most models it is far from clear what the clustering structure
looks like. However, most statistical physicists believe that the failure of replica symmetry
(RS) indeed leads to clustering [29, 30]. So far, only few analytical studies of the
clustering properties of classical combinatorial optimization problems like SAT have been
performed [31]. These results may depend on the specific assumptions one makes when applying
certain analytical tools and when performing approximations. In particular, it is unlikely
that the clustering of models on dilute graphs is exactly the same as that found for the
mean-field SK spin glass. So from the physicist's point of view, it is quite interesting to
study the organization of the phase space using numerical methods, to understand better the
meaning of "complex organization of phase space" for other, non-mean-field models, like
combinatorial optimization problems. It is the aim of this paper to study numerically the
clustering properties of two particular problems, the vertex-cover problem and the number
partitioning problem.

The study of the solution structure is not only important for physicists, but also of interest 
for computer science. From an algorithmic point of view, especially the solution-space structure 
seems to play an important role. However, as we will see in this work, a direct one-to-one 
correspondence between cluster structure and hardness of the problem cannot be described at 
the moment. 

2. Models 

In this work, we deal with two classical problems [1] from computational complexity, the
vertex-cover problem (VC) and the number partitioning problem (NPP).

VC is defined for an arbitrary given graph $G = (V, E)$, $V$ being a set of $N$ vertices and
$E \subset V \times V$ a set of undirected edges. Let $V' \subset V$ be a subset of all
vertices. We call a vertex $v$ covered if $v \in V'$, uncovered if $v \notin V'$. Similarly,
an edge $\{i,j\}$ is covered if at least one of its endpoints $i, j$ is covered. If all edges
of $G$ are covered, then we call $V'$ a vertex cover $V_{\rm vc}$. We denote $X = |V'|$ and
$x = X/N$. We describe each subset $V'$ by a configuration vector $\vec{x} \in \{0,1\}^N$
with $x_i = 1 \Leftrightarrow i \in V'$. For a graph $G = (V,E)$ the minimal VC problem is the
following optimization problem: Construct a vertex cover $V_{\rm vc-min} \subset V$ of minimal
cardinality and find its size $X_{\min} = |V_{\rm vc-min}|$. Usually there are many solutions
of the same size, hence the solutions are degenerate.
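
The definitions above can be illustrated with a small brute-force sketch. It is not the algorithm used in this work (see section 3); the function names and the example graph are assumptions chosen for illustration only, and the enumeration is feasible only for very small $N$.

```python
from itertools import combinations

def is_vertex_cover(edges, cover):
    """Return True if every edge {i, j} has at least one endpoint in `cover`."""
    return all(i in cover or j in cover for i, j in edges)

def minimum_vertex_covers(n, edges):
    """Enumerate all minimum vertex covers of a graph on vertices 0..n-1.

    Brute force over subsets of increasing size: exponential in n,
    intended only to make the definition concrete.
    """
    for size in range(n + 1):
        covers = [set(c) for c in combinations(range(n), size)
                  if is_vertex_cover(edges, set(c))]
        if covers:
            return size, covers
    return n, []  # not reached: the full vertex set is always a cover

# Hypothetical example: a path 0 - 1 - 2 - 3
edges = [(0, 1), (1, 2), (2, 3)]
x_min, covers = minimum_vertex_covers(4, edges)
print(x_min, covers)
```

Even this tiny graph has several minimum covers of the same size, which is the degeneracy referred to in the text.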

Here we study VC on an ensemble of random graphs with $V = \{1, 2, \ldots, N\}$ and $cN/2$
randomly drawn, undirected edges $\{i,j\} \in E$. In this notation $c$ is the connectivity,
i.e. the average number of edges each vertex is contained in. For this ensemble, one can
calculate the average fraction $X_{\min}/N$ of vertices that have to be covered for minimum
vertex covers by using methods from statistical mechanics [22] like the replica approach
[23, 24], the cavity approach [25], or by an analysis of a heuristic algorithm, called
leaf-removal [32], which approximately solves the problem [33]. The main result is that for
$c < e \approx 2.718$, $x_{\min} = X_{\min}/N$ can be obtained exactly, while for $c > e$ only
approximate solutions have been found so far. Within the statistical mechanics treatment [22],
this means that the solutions for $c < e$ are RS, while for larger connectivities RSB appears.
It has been found [34] that the onset of RSB can be seen numerically from analyzing the
cluster structure of the solution space. Note that the leaf-removal heuristic [32] allows, in
conjunction with an exact branch-and-bound approach [35], to solve VC on random graphs for
$c < e$ typically in polynomial time. For $c > e$, one typically needs an exponential running
time. Hence, in this case, the onset of a complex solution landscape, visible analytically
[22] and numerically as presented in this work, coincides with an "easy-hard transition" of
the typical computational complexity.
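
As a rough illustration of the leaf-removal idea mentioned above, the following sketch repeatedly picks a vertex of degree one, covers its neighbor, and removes both from the graph. This is the commonly described form of the heuristic and is an assumption here, not a transcription of reference [32]; the function name and return convention are hypothetical. If no edges remain at the end (empty "core"), the constructed cover is a minimum one; otherwise the remaining core has to be treated by other means.

```python
def leaf_removal(n, edges):
    """Leaf-removal sketch for vertex cover on vertices 0..n-1.

    Returns (cover, core_edges): the vertices covered so far and the
    edges of the remaining core that contains no leaves.
    """
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    cover = set()
    leaves = [v for v in adj if len(adj[v]) == 1]
    while leaves:
        v = leaves.pop()
        if len(adj[v]) != 1:
            continue  # stale entry: v is no longer a leaf
        (u,) = adj[v]          # the unique neighbor of the leaf
        cover.add(u)           # covering u is never worse than covering v
        for w in list(adj[u]):  # remove u together with all its edges
            adj[w].discard(u)
            if len(adj[w]) == 1:
                leaves.append(w)
        adj[u] = set()

    core_edges = {(min(a, b), max(a, b)) for a in adj for b in adj[a]}
    return cover, core_edges

cover_path, core_path = leaf_removal(4, [(0, 1), (1, 2), (2, 3)])   # a tree
cover_tri, core_tri = leaf_removal(3, [(0, 1), (1, 2), (0, 2)])     # a triangle
print(cover_path, core_path, cover_tri, core_tri)
```

The path is solved completely, while the triangle has no leaves and is left untouched as a core.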

The NPP is defined for a given set $A$ of $N$ natural positive $M$-bit numbers
$0 < a_i \le 2^M - 1$. The NPP is the optimization problem to find a partition of $A$ into two
subsets $A'$ and $A \setminus A'$ such that the difference (or energy)
$E = \left| \sum_{a_i \in A'} a_i - \sum_{a_i \in A \setminus A'} a_i \right|$ is minimal.
Similar to VC, a partition can also be described by a vector $\vec{x} \in \{0, 1\}^N$ with, in
this case, $x_i = 1 \Leftrightarrow a_i \in A'$. Note that partitions with $E = 0$ are called
perfect.
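
To make the energy function concrete, the following minimal sketch evaluates $E$ for a configuration vector and enumerates all $2^N$ partitions of a small instance. The function names and the example numbers are assumptions for illustration; the exhaustive enumeration is not the exact algorithm used in this work.

```python
from itertools import product

def partition_energy(numbers, x):
    """E = |sum over A' - sum over the complement|, with x_i = 1 meaning a_i in A'."""
    return abs(sum(a if xi else -a for a, xi in zip(numbers, x)))

def lowest_energy_partitions(numbers):
    """Return the minimal energy and all configurations reaching it (brute force)."""
    best, states = None, []
    for x in product((0, 1), repeat=len(numbers)):
        e = partition_energy(numbers, x)
        if best is None or e < best:
            best, states = e, [x]
        elif e == best:
            states.append(x)
    return best, states

# Hypothetical instance with a perfect partition: {4, 5, 6} versus {7, 8}
numbers = [4, 5, 6, 7, 8]
e_min, states = lowest_energy_partitions(numbers)
print(e_min, states)
```

Each partition appears twice in the list, once for every assignment of the labels $A'$ and $A \setminus A'$.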

Here we study NPP on an ensemble of random numbers, which are uniformly distributed in
$[0, 2^M - 1]$. It has been observed numerically [36], within a statistical mechanics
treatment [19], and also using an exact mathematical analysis [37], that for $\kappa = M/N$
larger than a critical value $\kappa_c(N)$ ($\kappa_c \to 1$ for $N \to \infty$) almost surely
no perfect partitions exist, while for $\kappa < \kappa_c(N)$ there are typically
exponentially many perfect partitions. This transition coincides with an "easy-hard
transition" of an exact algorithm [38], i.e. for a fixed value $\kappa < \kappa_c$ perfect
partitions can typically be found in a time increasing polynomially with $N$, while for
$\kappa > \kappa_c$ it takes an exponentially growing running time to find the minimum
partition. Below, we will analyze the cluster structure of NPP. To study the behavior as a
function of system size, since $\kappa_c$ depends on $N$, we use $k = \kappa/\kappa_c(N)$ as
the parameter indicating the distance from the phase boundary. Our main result is that the
behavior does not differ significantly below and above $k = 1$, in contrast to the results
for VC.

3. Algorithms 

We apply exact algorithms, which are guaranteed to find the optimum solution. These algorithms
are based on the branch-and-bound approach [39]. The basic idea is that, as each variable
$x_i$ of a configuration is either 0 or 1, there are $2^N$ possible configurations, which can
be arranged as leaves of a binary (branching) tree. At each node of the tree, the two subtrees
represent the subproblems where the corresponding variable is either 0 or 1. The algorithm
constructs this tree partially, while searching for a solution. A subtree is omitted if its
leaves can be proven to contain less favorable configurations than the best of all previously
considered configurations; this is the so-called bound. The actual performance of a
branch-and-bound algorithm depends strongly on the heuristic which is used to decide in which
order the variables are assigned and on the bounds used. Both have to be chosen according to
the problem under consideration.
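
The branch-and-bound idea can be sketched for vertex cover as follows. This is a deliberately simple variant, assuming a branching on the two endpoints of an uncovered edge and a trivial bound; it does not reproduce the heuristics or bounds of the algorithms cited in this section, and the function name is hypothetical.

```python
def min_vertex_cover_bb(n, edges):
    """Exact minimum vertex cover by a simple branch-and-bound search."""
    best = list(range(n))  # trivial cover: all vertices

    def search(cover, remaining):
        nonlocal best
        if not remaining:
            # All edges are covered: keep the cover if it improves on the best one.
            if len(cover) < len(best):
                best = sorted(cover)
            return
        # Bound: at least one more vertex is needed for the remaining edges,
        # so this subtree cannot beat the best cover found so far.
        if len(cover) + 1 >= len(best):
            return
        # Branch: one endpoint of an uncovered edge must be in any cover.
        i, j = remaining[0]
        for v in (i, j):
            search(cover | {v}, [e for e in remaining if v not in e])

    search(frozenset(), list(edges))
    return best

star = min_vertex_cover_bb(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
path = min_vertex_cover_bb(4, [(0, 1), (1, 2), (2, 3)])
print(star, path)
```

Sharper bounds and a good branching order prune far more of the tree; the sketch only shows where they enter.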

For VC, we use the branch-and-bound algorithm presented in references [34, 35, 40], which
is based on reference [41]. For NPP, we apply the Korf algorithm [38] with a bound described
in reference [42].

4. Clustering Algorithms 

We apply two methods to study the cluster structure of the solution landscape:
neighborhood-based clustering and hierarchical clustering.

The neighborhood-based approach is based on the Hamming distance between different
solutions. The Hamming distance $d_{\rm ham}(\vec{x}^{(\alpha)}, \vec{x}^{(\beta)}) =
d_{\alpha\beta}$ of two solutions is the number of variables in which the two configurations
differ, i.e. $d_{\alpha\beta} = \sum_i |x_i^{(\alpha)} - x_i^{(\beta)}|$. If the Hamming
distance of two optimal solutions is not larger than a given value $d_{\max}$, we call
them neighbors. For VC, we use $d_{\max} = 2$ (the minimal possible distance of nonidentical
configurations), while for the NPP, $d_{\max}$ will depend on the system size $N$, see below.

We define a cluster $C$ as a maximal set of solutions that can be reached by repeatedly moving
to neighboring solutions. States which belong to different clusters are separated by a Hamming
distance of at least $d_{\max} + 2$ for VC (all optimal covers have the same cardinality, so
their mutual distances are even) resp. $d_{\max} + 1$ for NPP. Similar definitions of clusters
have been used e.g. for the analysis of random p-XOR-SAT [31] or finite-dimensional spin
glasses.

To decide whether two arbitrary solutions $\vec{x}^{(\alpha)}$ and $\vec{x}^{(\beta)}$ belong
to the same cluster or not, one needs to calculate the complete cluster $\vec{x}^{(\alpha)}$
(or $\vec{x}^{(\beta)}$) belongs to. Hence the clustering is very expensive.

The neighborhood-based algorithm is as follows, given a complete set S of all solutions:

begin
    i = 0 {number of so far detected clusters}
    while S not empty do
    begin
        i = i + 1
        remove an element x^(alpha) from S
        set cluster C_i = (x^(alpha))
        set pointer x^(beta) to first element of C_i
        while x^(beta) <> NULL do
        begin
            for all elements x^(gamma) of S
                if d_ham(x^(beta), x^(gamma)) <= d_max then
                begin
                    remove x^(gamma) from S
                    put x^(gamma) at the end of C_i
                end
            set pointer x^(beta) to next element of C_i
                or to NULL if there is no more
        end
    end
end
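
The pseudocode above can be sketched in Python as follows. Solutions are assumed to be given as tuples of 0/1 values; the function names are hypothetical, and the example uses three minimum vertex covers of a four-vertex path written as configuration vectors.

```python
def hamming(x, y):
    """Number of positions in which two configurations differ."""
    return sum(a != b for a, b in zip(x, y))

def neighborhood_clusters(solutions, d_max):
    """Group solutions into clusters connected by steps of Hamming distance <= d_max."""
    remaining = list(solutions)
    clusters = []
    while remaining:
        cluster = [remaining.pop()]      # seed a new cluster
        pointer = 0
        while pointer < len(cluster):    # walk through the growing cluster
            current = cluster[pointer]
            near = [s for s in remaining if hamming(current, s) <= d_max]
            remaining = [s for s in remaining if hamming(current, s) > d_max]
            cluster.extend(near)         # neighbors join the end of the cluster
            pointer += 1
        clusters.append(cluster)
    return clusters

solutions = [(1, 0, 1, 0), (0, 1, 1, 0), (0, 1, 0, 1)]
clusters_d2 = neighborhood_clusters(solutions, d_max=2)
clusters_d1 = neighborhood_clusters(solutions, d_max=1)
print(len(clusters_d2), len(clusters_d1))
```

The first and last configuration differ in four positions but still end up in one cluster for $d_{\max} = 2$, because both are neighbors of the middle one.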

The crucial point is that one really needs to consider all solutions and not just a sample. The
algorithm is quadratic in the number of solutions $\{\vec{x}^{(\alpha)}\}$, which makes the
method applicable to system sizes, depending on the actual problem, of the order of
$N \sim 100$.

As an alternative method, we will use a clustering approach that organizes the states in a
hierarchical structure. Such clustering methods [43] are widely used in general data analysis
and sometimes also in statistical mechanics, see e.g. references [45, 46, 15]. The methods all
start by assuming that all states belong to separate clusters. Similarity between clusters (and
states) is defined by a measure called the proximity matrix $d_{\alpha,\beta}$. At each step
two very similar clusters are joined, and so a hierarchical tree of clusters is formed. As
proximity measure for two initial clusters, each containing only a single state, we naturally
choose the Hamming distance between these two states as defined above, divided by the number
of vertices: $d_{\alpha,\beta} = d_{\alpha\beta}/N$. At each step the two clusters $C_\alpha$
and $C_\beta$ with the minimal distance are merged to form a new cluster $C_\gamma$. Then the
proximity matrix is updated by deleting the distances involving $C_\alpha$ and $C_\beta$ and
adding the distances between $C_\gamma$ and all other clusters $C_\delta$ in the system. So we
need to extend the proximity measure to clusters with more than one state, based on some
suitable update rule, which is usually a function of the distances $d_{\alpha,\beta}$,
$d_{\alpha,\delta}$ and $d_{\beta,\delta}$.

The choice of this function is a widely discussed field since it can have a great impact on
the clustering obtained [44]. It should represent the natural organization present in the data
and not some artificial structure induced from the choice of the update rule. Here we will use 
Ward's method (also called minimum-variance method) [47]. The distance between the merged
cluster $C_\gamma$ and some other cluster $C_\delta$ is given by

$$ d_{\gamma,\delta} = \frac{(n_\alpha + n_\delta)\, d_{\alpha,\delta} + (n_\beta + n_\delta)\, d_{\beta,\delta} - n_\delta\, d_{\alpha,\beta}}{n_\alpha + n_\beta + n_\delta} \qquad (1) $$

where $n_\alpha, n_\beta, n_\delta$ are the numbers of elements in clusters $C_\alpha, C_\beta,
C_\delta$, respectively. Heuristically, Ward's method seems to outperform other update rules.
The choice guarantees that at each step the two clusters to be merged are chosen such that the
variance inside each cluster, summed over all clusters, increases by the minimal possible
amount.
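
A minimal sketch of agglomerative clustering with the update rule above is given below. It is a naive implementation meant only to show how the rule is applied; the function name, the data layout (a dictionary of pairwise distances) and the example matrix are assumptions and not taken from the original implementation.

```python
def ward_linkage(dist):
    """Agglomerative clustering using the Ward update rule for merged clusters.

    dist: symmetric matrix (list of lists) of initial proximities.
    Returns a list of merges (a, b, distance, size of new cluster);
    new clusters get the ids n, n+1, ...
    """
    n = len(dist)
    d = {(a, b): dist[a][b] for a in range(n) for b in range(a + 1, n)}
    size = {a: 1 for a in range(n)}
    merges = []
    next_id = n
    while len(size) > 1:
        # Pick the pair of clusters with minimal distance.
        (a, b), d_ab = min(d.items(), key=lambda kv: kv[1])
        new = {}
        for s in size:
            if s in (a, b):
                continue
            d_as = d[min(a, s), max(a, s)]
            d_bs = d[min(b, s), max(b, s)]
            n_a, n_b, n_s = size[a], size[b], size[s]
            # Update rule: distance between the merged cluster and cluster s.
            new[s] = ((n_a + n_s) * d_as + (n_b + n_s) * d_bs
                      - n_s * d_ab) / (n_a + n_b + n_s)
        # Delete distances involving a and b, add those of the new cluster.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        for s, v in new.items():
            d[(s, next_id)] = v
        size[next_id] = size.pop(a) + size.pop(b)
        merges.append((a, b, d_ab, size[next_id]))
        next_id += 1
    return merges

# Hypothetical proximities: states 0,1 are close, states 2,3 are close.
dist = [[0, 1, 4, 4],
        [1, 0, 4, 4],
        [4, 4, 0, 1],
        [4, 4, 1, 0]]
merges = ward_linkage(dist)
print(merges)
```

The sequence of merges defines the tree of clusters; the two close pairs are joined first and the two resulting clusters last.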

The output of the clustering algorithm can be represented as a dendrogram. This is a tree
with the configurations as leaves and each node representing one of the clusters at different
levels of hierarchy, see the bottom half of the examples in figure 2. Note that Ward's
algorithm is able to cluster any data. Even if no structure were present, the data could
always be displayed as a dendrogram. Hence, one has to perform additional checks. Here, we use
a visual check, i.e. we plot the Hamming distances as a matrix where the rows and columns are
ordered according to the dendrogram.



5. Results 

We first summarize our results [34] for the solution-space analysis for VC. Second, we present
the data obtained for the NPP. 



5.1. Vertex cover 
Figure 1. Average number of clusters in the solution space of the largest connected component
as function of system size. The circle symbols for small system sizes have been obtained by
clustering complete sets of solutions. For large systems we sampled solutions with a Monte
Carlo algorithm at large but finite chemical potential $\mu$; for details see reference [34].



For every value of $N$ we sampled $10^4$ realizations. For each realization, we considered
only the largest connected component of the given graph, since the solution-space structures
of different connected components are independent of each other. The average number of
clusters, as obtained from the neighborhood-based clustering using $d_{\max} = 2$, as a
function of the connectivity is shown as circles in figure 1. For larger system sizes, the
number of clusters was estimated as described in reference [34].




Figure 2. Sample dendrograms of 100 VC solutions for a graph with 400 vertices. Darker 
colors correspond to closer distances. The left one is at c = 2, i.e. in the RS phase. There 
is no structure present. For c = 6 the dendrogram provides a structure, where the solutions 
form clusters. The careful reader may recognize a second or third level of clustering in the right 
picture. 

For c < e the number of clusters remains close to one. For larger values of c the number of 
clusters increases with system size. Apparently the increase is compatible with a logarithmic 
growth as a function of system size. Hence, the change from simple to complex behavior, as 
expected from the analytical results [22] is visible through the cluster analysis. 

The results of the hierarchical clustering for VC are shown in figure 2. Darker colors
correspond to smaller distances. The figure shows two different realizations: For small values
of $c < e$, the system is in the RS phase and only a single cluster is present. For larger
values of $c$, the ordering of the states obtained by the clustering algorithm reveals an
underlying structure, which can be seen in the right part of the figure. One can see that the
states form groups where the Hamming distance between the members is small (dark colors) while
the distance to other states is large. Thus, our results are compatible with clustering being
present for realizations with $c > e$. A careful look reveals more structure inside the
clusters. Multiple levels of clustering indicate higher levels of RSB, which we expect to be
present for these values of $c$ [MIES].

5.2. Number-partitioning problem 

Next, we consider the NPP, to see whether similar clustering and coincidences can be found there
as well. For all values of $k = \kappa/\kappa_c$ and all sizes $N$, the $Z = 2\exp(0.2N)$
energetically lowest-lying configurations were selected. In the solvable phase $k < 1$, these
are perfect partitions, while for $k > 1$ we take the $Z$ configurations with the lowest
values of $E$. In figure 3 the average number of clusters obtained using the
neighborhood-based clustering with $d_{\max} = \sqrt{N}$ is shown. In all cases, in the "easy"
as well as in the "hard" phase, a basically exponential increase of the number of clusters is
obtained. Note that if instead $d_{\max} = 2$ had been chosen, like in the VC case, an even
larger number of clusters would result; hence the large number of clusters is not an artifact
of the choice of $d_{\max}$. From this result, one is tempted to conclude that the solution
landscape is very complex.
Figure 3. Average number of clusters in the solution space of the $Z = 2\exp(0.2N)$
energetically lowest-lying configurations as a function of the system size, for three
different values of $k$. Note that $d_{\max} = \sqrt{N}$ has been used.

Nevertheless, when one performs the hierarchical clustering with a certain number (200)
of randomly selected solutions ($\kappa < \kappa_c$) or the 200 energetically lowest-lying
configurations ($\kappa > \kappa_c$), one obtains a simple uniform distribution of the
distances, without any structure visible, see figure 4. This shows that in both regions of the
phase space, the solution space basically consists of equally spaced configurations, which
have a large distance from each other such that they appear as single-configuration clusters.
Note that we have also studied larger systems using stochastic algorithms, and again no
structure was found. Hence, the structure of the solution space is in fact simple everywhere,
in contrast to VC. This might coincide with the notion that the NPP can be seen [48] as a
variant of the random-energy model [49], where each configuration is assigned a randomly and
independently selected energy value.

To support this finding, we have compared the solution-space structure of the NPP to a set
of random bit strings (RBS), i.e. to a set of configurations with the least possible structure.
For this purpose, we again use the neighborhood-based clustering with $d_{\max} = \sqrt{N}$
and measure the average number of clusters (averaged over 100 realizations) as a function of
the number $Z$ of configurations included in the clustering. We do this for the NPP for
$k = 1.2$, where the $Z$ energetically lowest-lying configurations are considered. For the RBS
case, we use $Z$ independent strings. Here we expect that for small values of $Z$, the average
number of clusters increases and is close to $Z$, because each configuration forms a single
cluster. As $Z$ grows further, the configurations lie more densely in the configuration space,
hence the number of configurations which have a neighbor within the distance $d_{\max}$
increases, leading to a decrease of the number of clusters. Finally, if $Z$ is large enough,
all configurations are interconnected, hence one obtains just one single cluster. This
behavior is exactly visible




Figure 4. Sample dendrograms (bottom) and distance matrices (top) of 200 NPP solutions 
for systems with 35 numbers. Darker colors correspond to closer distances. The left one is at
$k = 1.5$, i.e. in the hard and unsolvable phase, while the right is for $k = 0.6$, i.e. in the solvable
phase, where perfect partitions exist. In both cases there is no structure present. 

in figure 5. Interestingly, the resulting curves for the NPP follow the curves for the RBS
case very closely; hence one can conclude that the NPP has a cluster structure which is almost
indistinguishable from a random distribution, supporting the above results. Note that for VC,
we find different results. The number of clusters first increases a bit, and then decreases to
a value clearly above one, hence it differs strongly from the RBS case.
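
The random-bit-string reference case can be sketched as follows. The string length, the seed, the values of $Z$ and the function names are assumptions for illustration; the sketch uses a single realization per $Z$ instead of an average over 100 realizations, and counts clusters with a union-find structure, which gives the same clusters as the neighborhood-based algorithm of section 4.

```python
import math
import random

def random_bit_strings(z, n, seed=0):
    """Draw z independent random bit strings of length n."""
    rng = random.Random(seed)
    return [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(z)]

def count_clusters(strings, d_max):
    """Number of clusters when strings at Hamming distance <= d_max are neighbors."""
    parent = list(range(len(strings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            dist = sum(a != b for a, b in zip(strings[i], strings[j]))
            if dist <= d_max:
                parent[find(i)] = find(j)   # join the two clusters
    return len({find(i) for i in range(len(strings))})

n = 16
strings = random_bit_strings(50, n, seed=1)
for z in (5, 10, 20, 50):
    print(z, count_clusters(strings[:z], math.sqrt(n)))
```

Repeating this for many realizations and larger $Z$ gives the kind of reference curve against which the NPP and VC data are compared.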

6. Summary and Outlook 

We have studied the cluster structure of two combinatorial optimization problems, the
vertex-cover problem and the number partitioning problem, on suitably defined random ensembles.
In both cases, the existence of phase transitions which coincide with drastic changes of the
computational complexity is analytically well established. For many of these problems, exact
analytical solutions are at least currently out of reach; in particular the cluster structure
of the solution landscape has been studied only very partially by analytical means. Hence, in
this work, we analyze the cluster landscape numerically. We calculate exact solutions, which
we study using direct neighborhood clustering and via a hierarchical clustering approach. For
VC, the "easy-hard" phase transition is visible also in the cluster structure of the solution
space; the hard phase is dominated by a complicated hierarchy of solutions. For the NPP, in
contrast, the cluster structure is basically equivalent to a random distribution of the
solutions in the "easy"/solvable as well as in the "hard"/unsolvable phase. Hence, a direct
correspondence between the structure of the solution space, the solvability, and the hardness
of finding a (best) solution does not seem to exist. Thus, the NPP seems to be hard in a
different way than VC. The reason might be that the NPP is a pseudopolynomial problem, i.e.
when fixing the number $M$ of bits, the problem always becomes quickly solvable for
$N \to \infty$.
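
The pseudopolynomial character can be illustrated with the standard subset-sum dynamic programming approach, sketched below under the assumption that all numbers are non-negative integers. This is a textbook technique, not the algorithm used in this work, and the function name is hypothetical. Since every number is below $2^M$, the set of reachable sums has at most $N 2^M$ entries, so for fixed $M$ the effort grows only polynomially with $N$.

```python
def min_partition_energy(numbers):
    """Minimal partition energy E via subset-sum dynamic programming.

    Bit s of `reachable` is set if some subset of the numbers sums to s.
    """
    total = sum(numbers)
    reachable = 1  # only the empty sum 0 is reachable at the start
    for a in numbers:
        reachable |= reachable << a  # either skip a or add it to a subset
    return min(abs(total - 2 * s)
               for s in range(total + 1) if (reachable >> s) & 1)

print(min_partition_energy([4, 5, 6, 7, 8]))
print(min_partition_energy([1, 2, 4]))
```

The table of reachable sums grows with the magnitude of the numbers, which is why this approach does not help when $M$ grows proportionally to $N$.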




Figure 5. Average number of clusters from neighborhood-based clustering (full symbols) in
the solution space of the $Z$ energetically lowest-lying configurations as a function of $Z$
for different values of $N$. For comparison, the number of clusters is also shown when $Z$
random bit strings are used instead (open symbols), i.e. under the assumption of absolutely no
order in the configuration space. Left: NPP for $k = 1.2$, $d_{\max} = \sqrt{N}$. Right: VC
for $c = 4$, $d_{\max} = 2$.

To further elucidate possible relationships, other optimization problems should be studied 
in this way in the future. In particular the authors are currently working on the satisfiability 
problem, where several changes of the solution-space structure are expected from approximate 
analytical solutions [16, 30].